A Disentangled Low-Rank RNN Framework for Uncovering Neural Connectivity and Dynamics

Li, Chengrui, Wang, Yunmiao, Wang, Yule, Li, Weihan, Jaeger, Dieter, Wu, Anqi

arXiv.org Artificial Intelligence

Low-rank recurrent neural networks (lrRNNs) are a class of models that uncover low-dimensional latent dynamics underlying neural population activity. Although their functional connectivity is low-rank, it lacks disentanglement interpretations, making it difficult to assign distinct computational roles to different latent dimensions. To address this, we propose the Disentangled Recurrent Neural Network (DisRNN), a generative lrRNN framework that assumes group-wise independence among latent dynamics while allowing flexible within-group entanglement. These independent latent groups allow latent dynamics to evolve separately, but are internally rich for complex computation. We reformulate the lrRNN under a variational autoencoder (VAE) framework, enabling us to introduce a partial correlation penalty that encourages disentanglement between groups of latent dimensions. Experiments on synthetic, monkey M1, and mouse voltage imaging data show that DisRNN consistently improves the disentanglement and interpretability of learned neural latent trajectories in low-dimensional space and low-rank connectivity over baseline lrRNNs that do not encourage partial disentanglement.
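The low-rank connectivity structure the abstract describes can be sketched in a few lines. Below is a minimal, illustrative simulation (not the authors' DisRNN implementation; all parameter values are arbitrary) of a rank-R recurrent network whose dynamics are driven through an R-dimensional latent state:

```python
import numpy as np

# Minimal low-rank RNN sketch: connectivity J = m @ n.T / N has rank R,
# so the N-dim activity x is driven entirely through the R-dim latent
# state kappa = n.T @ phi(x) / N that frameworks like DisRNN model.
rng = np.random.default_rng(0)
N, R, T, dt = 100, 2, 200, 0.1
m = rng.standard_normal((N, R))
n = rng.standard_normal((N, R))
J = m @ n.T / N                      # rank-R functional connectivity

x = rng.standard_normal(N) * 0.1
latents = np.zeros((T, R))
for t in range(T):
    phi = np.tanh(x)
    kappa = n.T @ phi / N            # low-dimensional latent trajectory
    x = x + dt * (-x + m @ kappa)    # equivalent to x + dt * (-x + J @ phi)
    latents[t] = kappa
```

DisRNN's contribution is then to group the R latent dimensions and penalize partial correlations *between* groups while leaving within-group dynamics entangled.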



Diverse Influence Component Analysis: A Geometric Approach to Nonlinear Mixture Identifiability

Nguyen, Hoang-Son, Fu, Xiao

arXiv.org Artificial Intelligence

Latent component identification from unknown nonlinear mixtures is a foundational challenge in machine learning, with applications in tasks such as disentangled representation learning and causal inference. Prior work in nonlinear independent component analysis (nICA) has shown that auxiliary signals -- such as weak supervision -- can support identifiability of conditionally independent latent components. More recent approaches explore structural assumptions, e.g., sparsity in the Jacobian of the mixing function, to relax such requirements. In this work, we introduce Diverse Influence Component Analysis (DICA), a framework that exploits the convex geometry of the mixing function's Jacobian. We propose a Jacobian Volume Maximization (J-VolMax) criterion, which enables latent component identification by encouraging diversity in their influence on the observed variables. Under reasonable conditions, this approach achieves identifiability without relying on auxiliary information, latent component independence, or Jacobian sparsity assumptions. These results extend the scope of identifiability analysis and offer a complementary perspective to existing methods.
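The geometric idea behind a volume-maximization criterion can be illustrated with a toy computation (a hypothetical simplification, not the paper's exact J-VolMax objective): the absolute determinant of the mixing function's Jacobian measures the volume spanned by its columns, so maximizing it rewards latent components whose influences on the observations point in diverse directions.

```python
import numpy as np

# Finite-difference Jacobian of a toy nonlinear mixing map f: R^2 -> R^2.
def jacobian(f, z, eps=1e-6):
    d = len(z)
    J = np.zeros((d, d))
    fz = f(z)
    for i in range(d):
        dz = np.zeros(d)
        dz[i] = eps
        J[:, i] = (f(z + dz) - fz) / eps
    return J

f = lambda z: np.array([np.tanh(z[0]), np.tanh(z[0] + z[1])])
J = jacobian(f, np.zeros(2))              # approx [[1, 0], [1, 1]] at z = 0
log_vol = np.log(np.abs(np.linalg.det(J)))  # the "volume" being maximized
```

Degenerate mixtures, where one component's influence is a near-copy of another's, collapse this volume toward zero, which is what the diversity criterion penalizes.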


Similarity Component Analysis

Neural Information Processing Systems

Measuring similarity is crucial to many learning tasks. It is also a richer and broader notion than what most metric learning algorithms can model. For example, similarity can arise from the process of aggregating the decisions of multiple latent components, where each latent component compares data in its own way by focusing on a different subset of features. In this paper, we propose Similarity Component Analysis (SCA), a probabilistic graphical model that discovers those latent components from data. In SCA, a latent component generates a local similarity value, computed with its own metric, independently of other components. The final similarity measure is then obtained by combining the local similarity values with a (noisy-)OR gate. We derive an EM-based algorithm for fitting the model parameters with similarity-annotated data from pairwise comparisons. We validate the SCA model on synthetic datasets where SCA discovers the ground-truth about the latent components. We also apply SCA to a multiway classification task and a link prediction task.


Identifiable Autoregressive Variational Autoencoders for Nonlinear and Nonstationary Spatio-Temporal Blind Source Separation

Sipilä, Mika, Nordhausen, Klaus, Taskinen, Sara

arXiv.org Machine Learning

The modeling and prediction of multivariate spatio-temporal data involve numerous challenges. Dimension reduction methods can significantly simplify this process, provided that they account for the complex dependencies between variables and across time and space. Nonlinear blind source separation has emerged as a promising approach, particularly following recent advances in identifiability results. Building on these developments, we introduce the identifiable autoregressive variational autoencoder, which ensures the identifiability of latent components consisting of nonstationary autoregressive processes. The blind source separation efficacy of the proposed method is showcased through a simulation study, where it is compared against state-of-the-art methods, and the spatio-temporal prediction performance is evaluated against several competitors on air pollution and weather datasets.
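The nonstationary autoregressive latent processes underlying this identifiability result can be sketched as follows (an illustrative AR(1) simulation with arbitrary parameters, not the paper's model specification):

```python
import numpy as np

# Each latent source follows an autoregressive process z_t = phi * z_{t-1} + e_t.
# Nonstationarity (e.g. segment-wise varying phi or noise scale) is the kind
# of structure that such identifiability results exploit.
def sample_ar1(phi, sigma, T, rng):
    z = np.zeros(T)
    for t in range(1, T):
        z[t] = phi * z[t - 1] + sigma * rng.standard_normal()
    return z

rng = np.random.default_rng(1)
z = sample_ar1(phi=0.9, sigma=0.5, T=500, rng=rng)
```

An observed spatio-temporal field is then a nonlinear mixture of several such sources, and the autoencoder's task is to invert that mixture up to the guaranteed indeterminacies.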



Provable Subspace Identification Under Post-Nonlinear Mixtures

Neural Information Processing Systems

Unsupervised mixture learning (UML) is known to be challenging: even learning linear mixtures requires highly nontrivial analytical tools, e.g., independent component analysis or nonnegative matrix factorization.


From What to How: Attributing CLIP's Latent Components Reveals Unexpected Semantic Reliance

Dreyer, Maximilian, Hufe, Lorenz, Berend, Jim, Wiegand, Thomas, Lapuschkin, Sebastian, Samek, Wojciech

arXiv.org Artificial Intelligence

Transformer-based CLIP models are widely used for text-image probing and feature extraction, making it relevant to understand the internal mechanisms behind their predictions. While recent works show that Sparse Autoencoders (SAEs) yield interpretable latent components, they focus on what these encode and miss how they drive predictions. We introduce a scalable framework that reveals what latent components activate for, how they align with expected semantics, and how important they are to predictions. To achieve this, we adapt attribution patching for instance-wise component attributions in CLIP and highlight key faithfulness limitations of the widely used Logit Lens technique. By combining attributions with semantic alignment scores, we can automatically uncover reliance on components that encode semantically unexpected or spurious concepts. Applied across multiple CLIP variants, our method uncovers hundreds of surprising components linked to polysemous words, compound nouns, visual typography and dataset artifacts. While text embeddings remain prone to semantic ambiguity, they are more robust to spurious correlations compared to linear classifiers trained on image embeddings. A case study on skin lesion detection highlights how such classifiers can amplify hidden shortcuts, underscoring the need for holistic, mechanistic interpretability. We provide code at https://github.com/maxdreyer/attributing-clip.
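The instance-wise attribution the authors adapt rests on a linear approximation that can be sketched numerically (a hypothetical toy with made-up numbers, not the paper's implementation): the effect of patching a latent component is approximated by the activation difference times the output gradient, avoiding one forward pass per component.

```python
import numpy as np

# Attribution-patching-style linear approximation for three latent components:
# effect of restoring component c ~= (clean - corrupt activation) * gradient.
clean_act = np.array([0.8, 0.1, 0.5])     # activations on the clean input
corrupt_act = np.array([0.0, 0.0, 0.0])   # e.g. a zero-ablation baseline
grad = np.array([1.5, -0.2, 0.4])         # d(logit) / d(activation)

attribution = (clean_act - corrupt_act) * grad   # [1.2, -0.02, 0.2]
top_component = int(np.argmax(np.abs(attribution)))
```

Combining such scores with semantic alignment scores is what lets the framework flag components that are highly influential yet encode unexpected concepts.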


Latent Mode Decomposition

Morante, Manuel, Rehman, Naveed ur

arXiv.org Artificial Intelligence

We introduce Variational Latent Mode Decomposition (VLMD), a new algorithm for extracting oscillatory modes and associated connectivity structures from multivariate signals. VLMD addresses key limitations of existing Multivariate Mode Decomposition (MMD) techniques, including high computational cost, sensitivity to parameter choices, and weak modeling of interchannel dependencies. Its improved performance is driven by a novel underlying model, Latent Mode Decomposition (LMD), which blends sparse coding and mode decomposition to represent multichannel signals as sparse linear combinations of shared latent components composed of AM-FM oscillatory modes. This formulation enables VLMD to operate in a lower-dimensional latent space, enhancing robustness to noise, scalability, and interpretability. The algorithm solves a constrained variational optimization problem that jointly enforces reconstruction fidelity, sparsity, and frequency regularization. Experiments on synthetic and real-world datasets demonstrate that VLMD outperforms state-of-the-art MMD methods in accuracy, efficiency, and the interpretability of extracted structures. Nonstationary signal decomposition techniques constitute an essential tool in signal processing for analyzing complex signals. Among them, Mode Decomposition (MD) has emerged as a fundamental framework, enabling the extraction of meaningful intrinsic oscillatory components [1]. Over the last couple of decades, a wide range of MD methods and algorithms have been developed and successfully applied in interdisciplinary settings such as biomedical signal analysis, structural health monitoring, and financial time-series analysis.
This particular trend dates back to the late nineties with the introduction of Empirical Mode Decomposition (EMD) [1], which was followed by the development of other similar alternatives, such as the Synchrosqueezed Transform (SST) [2], Variational Mode Decomposition (VMD) [3], and sliding-window Singular Spectrum Analysis (SSA) [4]. Originally designed for single-channel time-series analysis, some of these methods were later extended to handle multivariate time series. Notable multivariate algorithms include Multivariate Empirical Mode Decomposition (MEMD) [5], multivariate nonlinear chirp mode decomposition [6], iterative filtering [7], as well as Multivariate Variational Mode Decomposition (MVMD) [8].
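The LMD signal model described above can be illustrated with a small synthetic example (arbitrary frequencies and a made-up mixing matrix, not the paper's experimental setup): each channel is a sparse linear combination of shared latent AM-FM modes.

```python
import numpy as np

# LMD-style generative sketch: two channels mixing two shared AM-FM modes
# through a sparse mixing matrix A.
t = np.linspace(0.0, 1.0, 1000)
u1 = (1 + 0.3 * np.cos(2 * np.pi * 2 * t)) * np.cos(2 * np.pi * 10 * t)  # AM mode
u2 = np.cos(2 * np.pi * (20 * t + 15 * t ** 2))                          # FM (chirp) mode
U = np.stack([u1, u2])            # shared latent modes, shape (K, T)

A = np.array([[1.0, 0.0],         # sparse mixing: channel 1 sees only mode 1
              [0.5, 1.0]])        # channel 2 mixes both modes
X = A @ U                         # observed multichannel signal, shape (C, T)
```

Recovering both the sparse mixing weights and the latent modes from X is the joint inverse problem that VLMD's constrained variational optimization addresses.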


Interpreting and Steering Protein Language Models through Sparse Autoencoders

Garcia, Edith Natalia Villegas, Ansuini, Alessio

arXiv.org Artificial Intelligence

The rapid advancements in transformer-based language models have revolutionized natural language processing, yet understanding the internal mechanisms of these models remains a significant challenge. This paper explores the application of sparse autoencoders (SAE) to interpret the internal representations of protein language models, specifically focusing on the ESM-2 8M parameter model. By performing a statistical analysis on each latent component's relevance to distinct protein annotations, we identify potential interpretations linked to various protein characteristics, including transmembrane regions, binding sites, and specialized motifs. We then leverage these insights to guide sequence generation, shortlisting the relevant latent components that can steer the model towards desired targets such as zinc finger domains. This work contributes to the emerging field of mechanistic interpretability in biological sequence models, offering new perspectives on model steering for sequence design.
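The sparse autoencoder at the core of this approach has a compact canonical form. A minimal, untrained sketch (dimensions and initialization are arbitrary; ESM-2's hidden size and the authors' training recipe are not reproduced here):

```python
import numpy as np

# Minimal sparse autoencoder over a language model's hidden state h: a wide
# ReLU encoder whose sparse latent codes are the interpretable "components",
# plus a linear decoder that reconstructs h from them.
rng = np.random.default_rng(0)
d_model, d_latent = 16, 64               # overcomplete latent dictionary
W_enc = rng.standard_normal((d_latent, d_model)) * 0.1
b_enc = -0.5 * np.ones(d_latent)         # negative bias encourages sparsity
W_dec = rng.standard_normal((d_model, d_latent)) * 0.1

h = rng.standard_normal(d_model)         # stand-in for a residual-stream state
z = np.maximum(0.0, W_enc @ h + b_enc)   # sparse latent component activations
h_hat = W_dec @ z                        # reconstruction of the hidden state

sparsity = float(np.mean(z > 0))         # fraction of active components
```

Steering then amounts to clamping or scaling selected entries of z before decoding, which is how shortlisted components can push generation toward targets such as zinc finger domains.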